
Guard controller #1238

Merged

openshift-merge-robot merged 1 commit into openshift:master from ingvagabund:guard-controller on Dec 17, 2021

Conversation

@ingvagabund (Member) commented Oct 27, 2021

Deploy guarding pods with a PDB to ensure a node drain does not proceed unless at most one replica of an operand is unavailable. Each guarding pod checks the healthz status of the operand on the same node.

The guard controller repeatedly checks the available master nodes and renders an unmanaged guard pod for each one. Each guard pod is responsible for checking whether its corresponding operand (the operand pod running on the same node as the guard pod) is ready, by probing the /healthz endpoint. The guard uses an HTTPS probe to check the readiness of its operand. At the same time, a PDB with minAvailable set to len(masters)-1 is rendered. Each time the number of master nodes changes, the minAvailable field is updated accordingly.

Outstanding facts:

  • each guard pod has its nodeName set directly (so it is placed even when the KS is down)
  • the Guard controller can delete pods (in case the image or probe host changes)
  • the Guard controller requires create/delete condition checkers (so the PDB can be removed when the number of master nodes is scaled down to 1)
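The PDB sizing rule above can be sketched as a pure function. This is a minimal illustration, not the controller's actual code; the function name is invented for the example.

```go
package main

import "fmt"

// minAvailable mirrors the len(masters)-1 rule: with this value,
// at most one guard pod (and hence at most one operand) may be
// unavailable at a time.
func minAvailable(masterCount int) int {
	if masterCount <= 1 {
		// Single-node topology: any positive minAvailable would
		// block every drain, which is why the PDB is removed when
		// the cluster is scaled down to one master.
		return 0
	}
	return masterCount - 1
}

func main() {
	for _, n := range []int{1, 3, 5} {
		fmt.Printf("masters=%d minAvailable=%d\n", n, minAvailable(n))
	}
}
```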

Applied and tested in:

} else {
_, result.Changed, result.Error = DeleteStorageVersionMigration(ctx, clients.migrationClient, recorder, t)
}
case *unstructured.Unstructured:
Contributor

note that in combination with a RESTMapper, we'd be able to delete any manifest we can read, given a one-to-one mapping of resources to kinds.

Doesn't have to be in this PR, but seems like a valuable future thing.
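The RESTMapper idea can be illustrated with a toy kind-to-resource lookup. The mapper and the delete call here are simplified stand-ins, not the real k8s.io/apimachinery RESTMapper or dynamic-client API.

```go
package main

import "fmt"

// restMapper is a toy one-to-one mapping from kind to resource.
type restMapper map[string]string

// deleteByKind resolves a kind to its resource and describes the
// delete that a real implementation would issue via a dynamic client.
func deleteByKind(mapper restMapper, kind, name string) (string, error) {
	resource, ok := mapper[kind]
	if !ok {
		return "", fmt.Errorf("no resource mapping for kind %q", kind)
	}
	return fmt.Sprintf("delete %s/%s", resource, name), nil
}

func main() {
	mapper := restMapper{"ConfigMap": "configmaps"}
	out, err := deleteByKind(mapper, "ConfigMap", "guard-cm")
	fmt.Println(out, err)
}
```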

@deads2k (Contributor) commented Nov 1, 2021

move "Allow resource deletion based on a certain condition" to its own commit so I can merge it

@ingvagabund ingvagabund force-pushed the guard-controller branch 2 times, most recently from 50af26a to 1d58cbd Compare November 10, 2021 15:29
@ingvagabund (Member, Author)

/retest

@soltysh left a comment

Left some questions, in general this looks good.

@@ -0,0 +1,313 @@
// Code generated for package bindata by go-bindata DO NOT EDIT. (@generated)

Oh my, we got rid of go-bindata from all the operators, but library-go is still left with one 😞 We should fix that...

Member Author

That will require more changes. Better to do it in another PR.


Definitely a separate PR, just an observation 😉

for _, node := range nodes {
if _, exists := operands[node.Name]; !exists {
klog.Errorf("Missing operand on node %v", node.Name)
errs = append(errs, fmt.Errorf("Missing operand on node %v", node.Name))

During reboots or similar problems we'll be throwing a lot of errors; I don't think that's the intention. Shouldn't we filter out NotReady nodes earlier, in the node query? IIRC the installer ignores not-ready nodes.

Member Author

Good point. If the operand pod does not exist and the underlying node is not ready, I will skip the node. That should reduce the number of errors. Also, once a node condition changes, the sync loop is triggered again, so there is no need to return an error here.
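The agreed-upon filtering can be sketched as follows. The Node type and function name are simplified stand-ins; the real controller would inspect the Ready condition in corev1.Node's Status.Conditions.

```go
package main

import "fmt"

// Node is a simplified stand-in for corev1.Node.
type Node struct {
	Name  string
	Ready bool
}

// missingOperandErrors reports a missing operand only for ready
// nodes: NotReady nodes (e.g. mid-reboot) are skipped, since the
// sync loop re-runs once their condition changes.
func missingOperandErrors(nodes []Node, operands map[string]bool) []string {
	var errs []string
	for _, node := range nodes {
		if operands[node.Name] {
			continue
		}
		if !node.Ready {
			// No operand, but the node is not ready either; do not
			// treat this as an error.
			continue
		}
		errs = append(errs, fmt.Sprintf("missing operand on node %v", node.Name))
	}
	return errs
}

func main() {
	nodes := []Node{{"master-0", true}, {"master-1", false}, {"master-2", true}}
	fmt.Println(missingOperandErrors(nodes, map[string]bool{"master-0": true}))
}
```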

pdbGetter: pdbGetter,
installerPodImageFn: getInstallerPodImageFromEnv,
createConditionalFunc: createConditionalFunc,
deleteConditionalFunc: deleteConditionalFunc,

Looking at openshift/cluster-kube-scheduler-operator#373, I see this is only used for the SNO check. Is there any other particular reason you want to split this into create and delete conditions? If it's only SNO, why not have a single check?

@ingvagabund (Member, Author) commented Dec 1, 2021

Because create is not always the negation of delete (!delete). See also openshift/cluster-kube-controller-manager-operator#568 (comment)


Right, but returning a (bool, error) tuple is an option you also mention there.
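The point that create is not simply !delete can be illustrated with separate closures: when the topology is not yet known, neither condition should fire. The ConditionalFunction type is redeclared here so the sketch is self-contained (in library-go it lives in resourceapply); the guardConditions helper and its parameters are invented for the example.

```go
package main

import "fmt"

// ConditionalFunction mirrors resourceapply.ConditionalFunction,
// i.e. func() bool.
type ConditionalFunction func() bool

// guardConditions returns distinct create and delete conditions.
// If the topology is still unknown, both return false: the
// controller should neither create nor delete, which a single
// create = !delete check could not express.
func guardConditions(isSNO, topologyKnown bool) (create, del ConditionalFunction) {
	create = func() bool { return topologyKnown && !isSNO }
	del = func() bool { return topologyKnown && isSNO }
	return create, del
}

func main() {
	create, del := guardConditions(false, false)
	fmt.Println(create(), del()) // topology unknown: neither fires
}
```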


klog.V(5).Infof("Rendering guard pod for operand %v on node %v", operands[node.Name].Name, node.Name)

pod := resourceread.ReadPodV1OrDie(bindata.MustAsset(filepath.Join("pkg/operator/staticpod/controller/guard", "manifests/guard-pod.yaml")))

This seems like a lot of unnecessary reads; why not read it once when we create the controller, then just update the internal representation as needed and apply it?

Member Author

bindata.MustAsset already has the internal representation as it is. So it's just a convenient wrapper around it.

@soltysh left a comment

Two nits about ConditionalFunc, but those can either be addressed in a follow-up or I can be overruled by the majority 😉
/hold

/lgtm

return utilerrors.NewAggregate(errs)
}

func WithIsSNOCheck(infraLister configv1listers.InfrastructureLister, neg bool) resourceapply.ConditionalFunction {

Do we really need the neg parameter to negate the result? It's not that hard to add ! before the invocation, is it?

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Dec 7, 2021
@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Dec 7, 2021
@openshift-ci openshift-ci bot removed the lgtm Indicates that a PR is ready to be merged. label Dec 8, 2021
@soltysh left a comment

/lgtm
/hold cancel

@openshift-ci openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Dec 8, 2021
@openshift-bot

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@ingvagabund (Member, Author)

/retest

@openshift-bot

/retest-required

Please review the full test history for this PR and help us cut down flakes.

15 similar comments

@ingvagabund (Member, Author)

Something changed in the meantime:

pkg/operator/staticpod/controller/guard/guard_controller_test.go:379:7: unknown field 'Handler' in struct literal of type "k8s.io/api/core/v1".Probe
FAIL	github.com/openshift/library-go/pkg/operator/staticpod/controller/guard [build failed]

@ingvagabund (Member, Author)

Handler was renamed to ProbeHandler.
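The rename refers to the embedded field of corev1.Probe, which became ProbeHandler in Kubernetes 1.23. The types below are simplified stand-ins for the k8s.io/api/core/v1 originals (which also carry Exec and TCPSocket actions); only the embedding and field promotion are shown.

```go
package main

import "fmt"

// HTTPGetAction stands in for corev1.HTTPGetAction.
type HTTPGetAction struct{ Path string }

// ProbeHandler stands in for corev1.ProbeHandler, formerly named
// Handler; the rename is what broke the test build above.
type ProbeHandler struct{ HTTPGet *HTTPGetAction }

// Probe stands in for corev1.Probe. Because ProbeHandler is
// embedded, its fields are promoted (p.HTTPGet works directly).
type Probe struct {
	ProbeHandler // formerly: Handler
	PeriodSeconds int32
}

func main() {
	p := Probe{
		ProbeHandler:  ProbeHandler{HTTPGet: &HTTPGetAction{Path: "/healthz"}},
		PeriodSeconds: 10,
	}
	fmt.Println(p.HTTPGet.Path)
}
```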

Deploy guarding pods with a PDB to ensure a node drain does not
proceed unless at most one replica of an operand is unavailable.
Each guarding pod checks the healthz status of the operand on the
same node.
@openshift-ci openshift-ci bot removed the lgtm Indicates that a PR is ready to be merged. label Dec 17, 2021
@soltysh left a comment

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Dec 17, 2021
@openshift-ci bot commented Dec 17, 2021

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: deads2k, ingvagabund, soltysh

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment


Labels

approved — Indicates a PR has been approved by an approver from all required OWNERS files.
lgtm — Indicates that a PR is ready to be merged.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants